[#441] Preload embedding model at startup #461

Open
sahilds1 wants to merge 7 commits into CodeForPhilly:develop from sahilds1:441-embedding-models

@sahilds1 (Collaborator) commented Feb 13, 2026

Description

  • Preload SentenceTransformer model at Django startup before traffic is routed to the application instance
  • Add tests for the embeddings services by pulling apart the core logic to make testing easier
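
A minimal sketch of why preloading helps, assuming `TransformerModel.get_instance()` follows the usual cached-singleton pattern: the first call pays the load cost and later calls reuse the same object. The class body below is a hypothetical stand-in (a plain `object()` replaces the real SentenceTransformer load), not the project's actual implementation.

```python
import threading


class TransformerModel:
    """Hypothetical stand-in for the project's singleton wrapper."""

    _instance = None
    _lock = threading.Lock()
    load_count = 0  # Counts how many times the expensive load ran.

    def __init__(self):
        # Placeholder for the expensive step (e.g. constructing a
        # SentenceTransformer); a plain object keeps the sketch dependency-free.
        type(self).load_count += 1
        self.model = object()

    @classmethod
    def get_instance(cls):
        # Double-checked locking so concurrent first calls load only once.
        if cls._instance is None:
            with cls._lock:
                if cls._instance is None:
                    cls._instance = cls()
        return cls._instance


# Calling get_instance() during startup pays the load cost up front;
# later request-time calls reuse the cached instance.
startup_instance = TransformerModel.get_instance()
request_instance = TransformerModel.get_instance()
```

With this shape, the first request after a deploy no longer waits for the model to load, because the load already happened during startup.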

Related Issue

GitHub Issue #441

Manual Tests

sahildshah•~/github/balancer-main(441-embedding-models⚡)» docker compose up --build [15:15:29]
backend-1   | INFO 2026-03-10 19:16:04,973 _client 168 281473584205120 HTTP Request: HEAD https://huggingface.co/api/resolve-cache/models/sentence-transformers/paraphrase-MiniLM-L6-v2/c9a2bfebc254878aee8c3aca9e6844d5bbb102d1/config.json "HTTP/1.1 200 OK"
Loading weights: 100%|██████████| 103/103 [00:00<00:00, 4545.93it/s, Materializing param=pooler.dense.weight]

Automated Tests

sahildshah•~/github/balancer-main(441-embedding-models⚡)» docker compose exec backend pytest api/services/test_embedding_services.py -v [15:30:11]
================================================================================== test session starts ==================================================================================
platform linux -- Python 3.11.4, pytest-9.0.2, pluggy-1.6.0 -- /usr/local/bin/python
cachedir: .pytest_cache
django: version: 4.2.3, settings: balancer_backend.settings (from ini)
rootdir: /usr/src/server
configfile: pytest.ini
plugins: django-4.11.1, anyio-4.12.1
collected 4 items

api/services/test_embedding_services.py::test_evaluate_query_maps_fields PASSED                                                                                                   [ 25%]
api/services/test_embedding_services.py::test_evaluate_query_none_upload_file PASSED                                                                                              [ 50%]
api/services/test_embedding_services.py::test_log_usage_computes_distance_stats PASSED                                                                                            [ 75%]
api/services/test_embedding_services.py::test_log_usage_swallows_exceptions PASSED                                                                                                [100%]

=================================================================================== 4 passed in 0.03s ===================================================================================
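
For readers without the diff open, here is a rough sketch of the behavior the two `evaluate_query` tests appear to check: rows are mapped to plain dicts, and a row with no `upload_file` gets a `file_id` of `None`. The field names come from the docstring in this PR. The `None` handling and the use of `SimpleNamespace` rows in place of Django model instances are assumptions for illustration.

```python
from types import SimpleNamespace


def evaluate_query(queryset):
    """Turn query rows into plain dicts (sketch; the real mapping may differ)."""
    results = []
    for row in queryset:
        upload_file = getattr(row, "upload_file", None)
        results.append(
            {
                "name": row.name,
                "text": row.text,
                "page_number": row.page_number,
                "chunk_number": row.chunk_number,
                "distance": row.distance,
                # Assumed behavior: a row without an upload_file yields None.
                "file_id": upload_file.guid if upload_file is not None else None,
            }
        )
    return results


# Fake rows stand in for Embeddings instances so no database is needed.
rows = [
    SimpleNamespace(name="doc.pdf", text="chunk", page_number=1, chunk_number=0,
                    distance=0.25, upload_file=SimpleNamespace(guid="abc-123")),
    SimpleNamespace(name="orphan.pdf", text="chunk", page_number=2, chunk_number=1,
                    distance=0.5, upload_file=None),
]
results = evaluate_query(rows)
```

Because the function only reads attributes, tests of this kind can run without a database, which is consistent with the 0.03s runtime above.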

Documentation

Updated README with instructions for running backend tests

Reviewers

@taichan03 @amahuli03

Notes

@sahilds1 changed the title from "REFACTOR Pull apart get_closest_embeddings to make testing easier" to "[DRAFT] [#441] Embedding Models" on Feb 13, 2026
@sahilds1 self-assigned this on Feb 13, 2026
@sahilds1 changed the title from "[DRAFT] [#441] Embedding Models" to "[WIP] [#441] Embedding Models" on Feb 13, 2026
@sahilds1 changed the title from "[WIP] [#441] Embedding Models" to "[#441] Preload embedding model at startup" on Feb 27, 2026
@sahilds1 requested review from amahuli03 and taichan03 on March 10, 2026 19:34
@sahilds1 marked this pull request as ready for review on March 10, 2026 19:37
Copilot AI left a comment

Pull request overview

This PR aims to ensure the SentenceTransformer embedding model is loaded during Django startup (before traffic hits the instance) and to make the embeddings search logic more testable by factoring it into smaller functions.

Changes:

  • Refactors get_closest_embeddings by extracting query building, evaluation, and usage logging into helper functions.
  • Adds pytest + pytest-django support (requirements + pytest.ini) and new unit tests for embedding service helpers.
  • Updates GitHub Actions workflow and README to run backend tests.
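
As a rough illustration of the `log_usage` helper named in the first bullet, the sketch below computes distance statistics and swallows any failure from the persistence step, which is what the test names `test_log_usage_computes_distance_stats` and `test_log_usage_swallows_exceptions` suggest. The `save` callback, the `distance_stats` helper, and the stat keys are hypothetical. The real function presumably writes to a Django model.

```python
import logging
from statistics import mean

logger = logging.getLogger(__name__)


def distance_stats(results):
    """Return min, max, and mean distance, or None values for no results."""
    distances = [r["distance"] for r in results]
    if not distances:
        return {"min": None, "max": None, "avg": None}
    return {"min": min(distances), "max": max(distances), "avg": mean(distances)}


def log_usage(results, save):
    """Record usage via `save`; a logging failure never reaches the caller."""
    try:
        save(distance_stats(results))
    except Exception:
        logger.exception("Failed to log embedding usage")


recorded = []
log_usage([{"distance": 0.25}, {"distance": 0.75}], recorded.append)


def failing_save(stats):
    # Simulates the persistence layer being unavailable.
    raise RuntimeError("database unavailable")


log_usage([{"distance": 0.25}], failing_save)  # Logs the error, does not raise.
```

Passing the persistence step in as a callable is one way to keep the statistics testable without a database. The PR may achieve the same thing with mocks instead.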

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 4 comments.

File | Description
--- | ---
server/requirements.txt | Adds pytest dependencies to support running backend tests.
server/pytest.ini | Configures pytest-django settings/module and python path for the server package.
server/api/services/test_embedding_services.py | Adds unit tests for query evaluation and usage logging helpers.
server/api/services/embedding_services.py | Refactors embeddings search into build_query, evaluate_query, log_usage, and reworks get_closest_embeddings.
server/api/apps.py | Attempts to preload the embedding model during Django app initialization via ready().
README.md | Documents how to run backend tests inside the backend container.
.github/workflows/python-app.yml | Changes CI branch targets and adds dependency install + pytest execution.

Comment on lines 15 to +60
@@ -32,59 +31,52 @@ def get_closest_embeddings(
     Returns
     -------
-    list[dict]
-        List of dictionaries containing embedding results with keys:
-        - name: document name
-        - text: embedded text content
-        - page_number: page number in source document
-        - chunk_number: chunk number within the document
-        - distance: L2 distance from query embedding
-        - file_id: GUID of the source file
+    QuerySet
+        Unevaluated Django QuerySet ordered by L2 distance, sliced to num_results
     """
 
-    encoding_start = time.time()
-    transformerModel = TransformerModel.get_instance().model
-    embedding_message = transformerModel.encode(message_data)
-    encoding_time = time.time() - encoding_start
-
-    db_query_start = time.time()
-
     # Django QuerySets are lazily evaluated
     if user.is_authenticated:
         # User sees their own files + files uploaded by superusers
-        closest_embeddings_query = (
-            Embeddings.objects.filter(
-                Q(upload_file__uploaded_by=user) | Q(upload_file__uploaded_by__is_superuser=True)
-            )
-            .annotate(
-                distance=L2Distance("embedding_sentence_transformers", embedding_message)
-            )
-            .order_by("distance")
+        queryset = Embeddings.objects.filter(
+            Q(upload_file__uploaded_by=user) | Q(upload_file__uploaded_by__is_superuser=True)
         )
     else:
         # Unauthenticated users only see superuser-uploaded files
-        closest_embeddings_query = (
-            Embeddings.objects.filter(upload_file__uploaded_by__is_superuser=True)
-            .annotate(
-                distance=L2Distance("embedding_sentence_transformers", embedding_message)
-            )
-            .order_by("distance")
-        )
+        queryset = Embeddings.objects.filter(upload_file__uploaded_by__is_superuser=True)
+
+    queryset = (
+        queryset
+        .annotate(distance=L2Distance("embedding_sentence_transformers", embedding_vector))
+        .order_by("distance")
+    )
 
     # Filtering to a document GUID takes precedence over a document name
     if guid:
-        closest_embeddings_query = closest_embeddings_query.filter(
-            upload_file__guid=guid
-        )
+        queryset = queryset.filter(upload_file__guid=guid)
     elif document_name:
-        closest_embeddings_query = closest_embeddings_query.filter(name=document_name)
+        queryset = queryset.filter(name=document_name)
 
     # Slicing is equivalent to SQL's LIMIT clause
-    closest_embeddings_query = closest_embeddings_query[:num_results]
+    return queryset[:num_results]
Copilot AI Mar 11, 2026

build_query() introduces/relocates important filtering + precedence logic (authenticated vs unauthenticated visibility; guid-over-document_name; LIMIT slicing), but the new tests only cover evaluate_query and log_usage. Add unit/integration tests covering build_query behavior (e.g., guid precedence and the authenticated/unauthenticated queryset filters) to prevent regressions in access control and filtering.
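
One possible shape for such tests, sketched without a database: if the guid-over-document_name rule were pulled into a small pure helper, it could be asserted on directly. The `document_filter` helper below is hypothetical and not part of this PR. Tests against the real `build_query` would more likely use pytest-django with fixture data to also cover the authenticated and unauthenticated visibility filters.

```python
def document_filter(guid=None, document_name=None):
    """Return the filter kwargs selected by the precedence rule (hypothetical helper)."""
    if guid:
        # A document GUID takes precedence over a document name.
        return {"upload_file__guid": guid}
    if document_name:
        return {"name": document_name}
    return {}


def test_guid_takes_precedence_over_document_name():
    selected = document_filter(guid="abc-123", document_name="doc.pdf")
    assert selected == {"upload_file__guid": "abc-123"}


def test_document_name_used_when_guid_missing():
    assert document_filter(document_name="doc.pdf") == {"name": "doc.pdf"}


def test_no_filter_when_neither_given():
    assert document_filter() == {}
```

The returned kwargs use the same lookups as the diff above (`upload_file__guid`, `name`), so `queryset.filter(**document_filter(...))` would apply them unchanged.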

Comment on lines +8 to +10
    def ready(self):
        from .services.sentencetTransformer_model import TransformerModel
        TransformerModel.get_instance()
Copilot AI Mar 11, 2026

ApiConfig.ready() will only run if this AppConfig is actually used by Django. Right now INSTALLED_APPS appears to include just "api" (not "api.apps.ApiConfig"), and api/__init__.py doesn’t set a default config, so this preload hook may never execute. Consider updating INSTALLED_APPS to reference api.apps.ApiConfig (or otherwise ensuring this config is selected) so the model is preloaded as intended.

Comment on lines +8 to +10
    def ready(self):
        from .services.sentencetTransformer_model import TransformerModel
        TransformerModel.get_instance()
Copilot AI Mar 11, 2026

Calling TransformerModel.get_instance() unconditionally in ready() means the preload runs in every Django startup context (tests, migrations, management commands, autoreload) and can trigger a large model download/init even when no web traffic will be served. Consider gating this preload behind an explicit env flag (or limiting it to the web server entrypoint) so that CI and one-off management commands do not become slower or more fragile.
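
A minimal sketch of the env-flag option, assuming a flag named `PRELOAD_EMBEDDING_MODEL` (a hypothetical name, not an existing setting in this project). The `ready` function here is a plain stand-in for `ApiConfig.ready()` so the gating logic can run without Django. In the real hook, `load_model` would be `TransformerModel.get_instance`.

```python
import os

TRUTHY = {"1", "true", "yes", "on"}


def should_preload_model(environ=None):
    """Return True when the hypothetical PRELOAD_EMBEDDING_MODEL flag is set."""
    env = os.environ if environ is None else environ
    return env.get("PRELOAD_EMBEDDING_MODEL", "").strip().lower() in TRUTHY


def ready(load_model, environ=None):
    """Stand-in for ApiConfig.ready(): load the model only when the flag is set."""
    if should_preload_model(environ):
        load_model()
        return True
    return False


calls = []
# Web server entrypoint: flag set, so the model is loaded at startup.
ready(lambda: calls.append("loaded"), {"PRELOAD_EMBEDDING_MODEL": "true"})
# Tests, migrations, management commands: flag unset, so nothing is loaded.
ready(lambda: calls.append("loaded"), {})
```

Defaulting to "off" keeps tests and migrations fast, at the cost of having to set the flag explicitly wherever the web server is started.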

        run: pip install -r server/requirements.txt
      # Discover and run all files matching test_*.py or *_test.py under server/
      - name: Run tests
        run: pytest server/ -v
Copilot AI Mar 11, 2026

pytest.ini was added under server/, but this workflow runs pytest server/ -v from the repo root. Pytest won’t automatically discover config files in subdirectories, so DJANGO_SETTINGS_MODULE/pythonpath may not be applied and Django tests can fail to initialize. Fix by either cd server && pytest -v or running pytest -c server/pytest.ini server/ -v (or setting DJANGO_SETTINGS_MODULE in the workflow env).

Suggested change
-        run: pytest server/ -v
+        run: pytest -c server/pytest.ini server/ -v
